Processor and system architectures that feature multiple memory controllers are prone to show bottlenecks and erratic performance numbers on codes with regular access patterns. Although such effects are well known in the form of cache thrashing and aliasing conflicts, they become more severe when memory access is involved. Using the new Sun UltraSPARC T2 processor as a prototypical multi-core design, we analyze performance patterns in low-level and application benchmarks and show ways to circumvent bottlenecks by careful data layout and padding.
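
To make the padding idea concrete, the following is a minimal sketch of a toy address-to-controller mapping. The controller count, the interleave granularity, and all function names are hypothetical choices made for illustration; they are not taken from the abstract and should not be read as the actual UltraSPARC T2 mapping. Under these assumptions, arrays whose sizes are a multiple of the interleave period all start on the same controller, so a loop that touches them at the same index loads a single controller, while a small pad between arrays spreads the accesses out.

```python
# Hypothetical parameters, chosen for illustration only (not from the source).
NUM_CONTROLLERS = 4      # assumed number of memory controllers
INTERLEAVE_BYTES = 128   # assumed bytes served by one controller before switching


def controller_of(addr):
    """Controller serving a byte address under simple round-robin interleaving."""
    return (addr // INTERLEAVE_BYTES) % NUM_CONTROLLERS


def padded_bases(num_arrays, array_bytes, pad_bytes):
    """Base addresses of arrays laid out back to back, each followed by pad_bytes."""
    return [i * (array_bytes + pad_bytes) for i in range(num_arrays)]


def controllers_hit(bases, offset):
    """Controllers touched when every array is accessed at the same byte offset."""
    return [controller_of(base + offset) for base in bases]


# 1 MiB per array: a multiple of NUM_CONTROLLERS * INTERLEAVE_BYTES.
array_bytes = 1 << 20

# Without padding, all four arrays map to the same controller.
unpadded = controllers_hit(padded_bases(4, array_bytes, 0), offset=0)

# Padding each array by one interleave unit shifts successive arrays
# onto successive controllers.
padded = controllers_hit(padded_bases(4, array_bytes, INTERLEAVE_BYTES), offset=0)

print(unpadded)  # [0, 0, 0, 0]
print(padded)    # [0, 1, 2, 3]
```

On real hardware the mapping depends on the physical address bits the processor uses for controller selection, so a suitable pad size would have to be derived from the processor documentation or found by measurement.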